Abstract
The problem of testing for serial dependence of residuals in limited dependent
variable models is studied. The Tobit and truncated normal models are considered,
with particular emphasis on the former, although the methods used could be applied
in some other models. The work is motivated in part by a simulation study which
shows that the Gaussian maximum likelihood estimators based on independence
(which are known to be consistent but inefficient when the residuals are actually
serially dependent) can be seriously affected in small and moderate samples by serial
dependence. Because the usual tests for serial dependence of residuals are invalid,
we consider tests based on the likelihood under autocorrelation. But because of the
intractability of the latter, the Wald and likelihood ratio tests seem computationally
unfeasible and so attention is directed to Lagrange multiplier tests. These are
developed against a wide class of alternatives, and their powers in finite samples are
examined by simulations. Generally, efficient tests for higher order serial
correlation against autocorrelated null hypotheses prove to be computationally unfeasible.
However, we obtain an easily computable test statistic for a certain model
containing lagged dependent variables.
1.  Introduction
Limited  dependent  variable  (LDV)  models  are  frequently  employed  to  describe 
variables that are restricted to a proper subset of the real line (for example to
non-negative  values,   as   in  supply  of  and  demand  for  given  goods),  and  possibly 
take  some  values  with  positive  probability.   Following  earlier  study  of  the  censored 
normal  model  by  Cohen  [9]  and  others,  Tobin  [30]  introduced  such  a  model  (the  Tobit 
model)  in  an  investigation  of  household  expenditure  on  durables.   This  model  and 
variations  have  been  widely  used  in  cross  sectional  studies  (see  for  example  Fair 
[12], Quester and Greene [22]). LDV models can also be used for analyzing time series
data, as in Grether and Maddala [15], Robinson [24]. As is well known, autocorrelation
in economic time series is not always adequately accounted for by the exogenous
variables  and  the  residuals  may  be  serially  dependent,  particularly  when  the  model 
contains  no  lagged  endogenous  variables.   The  Tobit  estimators  obtained  by  maximizing 
the likelihood under independence and Gaussianity assumptions are known to be
consistent, but asymptotically inefficient, when the residuals are serially dependent
(Robinson  [26]),  as  in  the  classical  regression  model.   But  the  standard  tests  for 
serial  independence  used  for  the  latter  model  (such  as  the  Durbin-Watson  test) 
are  no  longer  valid  in  the  LDV  situation.   And  as  noted  in  [19],  [26],  asymptotically 
efficient  estimators  under  serial  dependence  are  in  general  very  difficult  to  compute 
and  their  asymptotic  statistical  properties  seem  difficult  to  obtain,  unlike  in  the 
classical  regression  model.   For  the  latter  reason  the  only  asymptotically  locally 
most powerful tests available are Lagrange multiplier (LM) tests (Rao [23], Silvey
[29])  because  these  require  estimation  of  the  model  only  under  the  null  white  noise 
hypothesis.  Such  tests  are  proposed  in  Sections  4  and  6,  for  the  Tobit  and  truncated 
models respectively, against a class of autocorrelated alternatives, which is
introduced in Section 3. The power in small and moderate samples of the Tobit test is
studied  in  Section  5.  In  Section  7  some  difficulties  associated  with  introducing 
lagged  dependent  variables   in  LDV  models  are  described.   To  begin  with,  however, 
some  simulations  will  illustrate  the  effect  of  serial  correlation  on  standard 
Tobit  estimators. 
2.  Effects of Serial Correlation
We consider the model

(2.1)  $y_t = \max(\beta' x_t + u_t,\ 0)$

where the column vectors $x_t$ and $\beta$ consist respectively of $K$ known and unknown constants.
If the $u_t$ are independent $N(0, \sigma^2)$, then the log likelihood function on the basis
of $T$ observations is given by

(2.2)  $\sum_{t=1}^{T} \{(1 - w_t) \ln[1 - F(\beta' x_t, \sigma)] + w_t \ln f(y_t - \beta' x_t, \sigma)\}$

where

(2.3)  $w_t = 0$ if $y_t = 0$;  $w_t = 1$ if $y_t > 0$

and $f(\cdot, \sigma)$, $F(\cdot, \sigma)$ are respectively the $N(0, \sigma^2)$ probability density function (pdf)
and distribution function (df). Let $\hat\theta = (\hat\beta', \hat\sigma^2)'$ be the value of $\theta = (\beta', \sigma^2)'$
maximizing (2.2). Because serious asymptotic statistical theory is only a secondary
aspect of the paper we use no special notation to denote true parameter values.
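
As an illustration of (2.2) and (2.3), the following is a minimal sketch of the independence log likelihood in Python. The function name `tobit_loglik` and the list-based data layout are assumptions made for this example only; the paper itself maximizes (2.2) by Fair's [11] algorithm, which is not reproduced here.

```python
from math import log
from statistics import NormalDist


def tobit_loglik(y, x, beta, sigma):
    """Log likelihood (2.2) for y_t = max(beta'x_t + u_t, 0), u_t iid N(0, sigma^2).

    y     : list of observations y_t
    x     : list of regressor lists x_t (each of length K)
    beta  : list of K coefficients
    sigma : standard deviation of u_t
    """
    u_dist = NormalDist(0.0, sigma)  # supplies f(., sigma) and F(., sigma)
    total = 0.0
    for y_t, x_t in zip(y, x):
        index = sum(b * xi for b, xi in zip(beta, x_t))  # beta'x_t
        if y_t > 0:
            # w_t = 1: uncensored observation contributes ln f(y_t - beta'x_t)
            total += log(u_dist.pdf(y_t - index))
        else:
            # w_t = 0: censored observation contributes ln[1 - F(beta'x_t)]
            total += log(1.0 - u_dist.cdf(index))
    return total
```

A call such as `tobit_loglik([0.0, 1.5], [[1.0], [1.0]], [1.0], 1.0)` evaluates the criterion for a two-observation intercept-only model; any numerical optimizer could then be applied to it.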


Amemiya [1], Hoadley [16] established the consistency and asymptotic
normality and efficiency of $\hat\theta$, under regularity conditions. The consistency property is
not robust to departures from normality (as demonstrated by Arabmazar and Schmidt [4],
Robinson [26], and White [32]) or homoscedasticity (Uarner [3], Arabmazar and
Schmidt [3]) but it continues to hold under a wide class of serially
dependent $u_t$ [26]. Nevertheless, $\hat\theta$ will be asymptotically inefficient in the latter
situation, and the asymptotic covariance matrix under serial dependence is given
in [25]. There is also the possibility that serial correlation could affect
finite-sample bias, and because information on this question, and finite-sample
inefficiency, cannot be gained analytically we turn to Monte Carlo simulations.
(All simulations in the paper were carried out on the Australian National University's
UNIVAC 1100 computer.)

The data were generated according to the simple censored normal model

(2.4)  $y_t = \max(\beta + u_t,\ 0)$,

but with $u_t$ a stationary Gaussian process with $j$th autocorrelation $\rho_j = \alpha^j$, $j \geq 1$.
Thus the vector $\beta$ has a single element, and so we can denote this simply by $\beta$, its estimator
being $\hat\beta$. Although (2.4) is not of direct practical use for economic data, we
employ it for illustrative purposes partly because of its simplicity, partly because
it is an important leading case in the censored data literature (see e.g. [9]) and
thus of some interest in itself, and partly because it has been used elsewhere in the
econometric literature, Nelson [21]. Note, incidentally, that in the uncensored
version of (2.4), $y_t = \beta + u_t$, $\hat\beta$ becomes the sample mean of the $y_t$, which is well known
to be asymptotically efficient in the presence of quite general serially correlated $u_t$.
This phenomenon, which in the uncensored case appears also in the presence of polynomial
and trigonometric regressors, does not occur in (2.1), (2.4), however, as is evident
from the covariance matrix formulae in [26].
Our aim in the simulations was to study the effects of different degrees of
serial correlation on the bias and variance of $\hat\beta$ and $\hat\sigma$. Five values of $\alpha$ were chosen,
0.0(0.2)0.8, the white noise innovations in $u_t$ being generated by subroutines of
Naylor et al [20], and the variance $\tau^2$ of the innovations being adjusted with changes
in $\alpha$ to keep $\sigma^2 = \tau^2/(1 - \alpha^2)$ fixed at 2. To represent both weak and strong censoring

we took two values of $\beta$, 1.0 and -1.0; $P(u_t + 1 > 0) = 0.76$ and $P(u_t - 1 > 0) = 0.24$.
The T-values employed were 30, 50, 70, 100 and 150.
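
The design just described can be sketched as follows. This is an illustrative reconstruction, not the original UNIVAC code: the function names, the use of Python's `random` module in place of the subroutines of [20], and the choice to start the error process from its stationary distribution are all assumptions of this example.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_censored_ar1(T, beta, alpha, sigma2=2.0, seed=0):
    """Generate y_t = max(beta + u_t, 0) with AR(1) errors, rho_j = alpha**j.

    The innovation variance tau^2 = sigma2 * (1 - alpha^2) is chosen so that
    V(u_t) = sigma2 whatever the value of alpha.
    """
    rng = random.Random(seed)
    tau = sqrt(sigma2 * (1.0 - alpha ** 2))  # innovation standard deviation
    u = rng.gauss(0.0, sqrt(sigma2))         # assumed: stationary starting value
    y = []
    for _ in range(T):
        y.append(max(beta + u, 0.0))         # censoring at zero
        u = alpha * u + rng.gauss(0.0, tau)  # AR(1) recursion
    return y


def prob_uncensored(beta, sigma2=2.0):
    """P(u_t + beta > 0) for u_t ~ N(0, sigma2), i.e. the expected share of y_t > 0."""
    return NormalDist(0.0, sqrt(sigma2)).cdf(beta)
```

Calling `prob_uncensored(1.0)` and `prob_uncensored(-1.0)` gives the expected proportions of uncensored observations under the weak and strong censoring designs respectively.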

For every combination of T, $\beta$ and $\alpha$ values $\hat\theta$ was computed on each of 500
replications, by Fair's [11] algorithm. The sample biases and variances were
calculated, and displayed as the upper and lower entries in the cells in Table I ($\beta$ = 1.0)
and Table II ($\beta$ = -1.0). The following features are identified. (1) In Table I, $\hat\beta$
and $\hat\sigma$ always underestimate, on the average; the bias increases with $\alpha$, but not always
monotonically and the results tend to reflect the consistency of $\hat\theta$ established in [26],
although at the same time the decay in bias as T increases seems disappointingly slow,
whatever the extent of serial correlation. (2) Under the heavier censoring, Table II
shows a similar $\hat\sigma$ bias, but the bias of $\hat\beta$ varies substantially, and somewhat surprisingly
becomes positive for large $\alpha$. (3) In both Tables, the variances of the estimates
decrease steadily with T, and again this is to be expected from consistency. However, the
variances also increase quite rapidly with $\alpha$ (for example in Table I with T = 30 there
is a 539% increase for $\hat\beta$ as $\alpha$ goes from 0 to 0.8), and the decrease in variance with
T is in some cases less apparent for large $\alpha$ than for small $\alpha$. For instance, in Table
II as T increases from 30 to 150, the $\hat\beta$ variance decreases by 87% and 66%, for $\alpha$ = 0
and 0.8 respectively. A broad conclusion to be drawn from the simulations is that
larger sample sizes are needed to get reasonable estimates when serial correlation is
present, than when there is independence: note for example that in Table I the bias
and variance for $\alpha$ = 0.8, T = 100 are larger than those for $\alpha$ = 0, T = 30. Finally,
as a broad comparison of Tables I and II, we detect that an increase in the degree of
censoring has a weak tendency to increase bias, and a strong tendency to increase
variance.

(Tables  I  and  II  about  here) 

The simulation results, along with the theoretical ones in [26], point to the
desirability of testing for serial correlation before treating $\hat\theta$ as if it were based
on independent observations. Three testing procedures that are commonly used because
of their optimal local power properties are the Wald, likelihood ratio (LR), and LM
tests. The first two both require maximization of the likelihood under the alternative,
serially correlated, hypothesis, and as shown by Robinson [25], [26] (and see below)
such a likelihood involves, for the Tobit model (2.1), a multinormal df of dimension
equal to the total number of zero $y_t$ and therefore presents formidable problems, both in
computation and in the statistical theory. Robinson [24], [25, Theorem 4.1] indicated that
for certain patterns of censoring and models involving autoregressive structures, the
multinormal distribution function can be expressed in terms of integrals of smaller dimension,
and, when censored values are very sparse, possibly as a product of univariate normals. In
the latter situation, at least, the likelihood is certainly tractable, and Wald and LR
statistics can be computed. Because data patterns of the required type will be few
and far between in practice, however, we have decided to ignore such special situations,
and for the truncated model no such simplifications can ever arise. Thus we have
effectively disqualified the Wald and LR tests from our considerations. We are left
with the LM test, which involves maximizing only the null hypothesis white noise
likelihood, and is thence, as we shall show, easily computed. Indeed the circumstances
studied in the present paper provide spectacular examples of the difference in
computational effort between the LM test on the one hand, and the Wald and LR tests on the
other. A general test for misspecification in the Tobit model has been proposed by
Nelson [21]; because his test relies on inconsistency of the Tobit estimators under the
alternative, it is not appropriate for testing for serial correlation.


3.  A General Class of Alternatives
As a class of alternatives to independence we take, as in [26], $u_t$ to be
generated by a stationary Gaussian process. The $j$th autocorrelation of $u_t$, $\rho_j(\psi)$,
is a uniquely defined function of a $p$-dimensional column vector $\psi$, which is
functionally unrelated to $\theta$. We assume for a unique value of $\psi$, which we take with no loss of
generality to be the vector of zeros 0, that $\rho_j(0) = 0$, $j = 1, 2, \ldots$. Thus we consider
testing

(3.1)  $H_0: \psi = 0$, versus $H_1: \psi \neq 0$.

We assume that in a neighbourhood of $\psi = 0$ the spectral density $(\sigma^2/2\pi) S(\omega; \psi)$, where

(3.2)  $S(\omega; \psi) = 1 + 2 \sum_{j=1}^{\infty} \rho_j(\psi) \cos j\omega$,  $-\pi < \omega < \pi$,

exists, and is differentiable in $\psi$, the derivatives at $\psi = 0$ being square-integrable;
moreover that the matrix

(3.3)  $\int_{-\pi}^{\pi} \frac{\partial \log S(\omega; 0)}{\partial \psi} \frac{\partial \log S(\omega; 0)}{\partial \psi'} \, d\omega$

is positive definite. The latter is a familiar identifiability condition for time
series models, automatically excluding, for example, mixed autoregressive moving average
alternative hypotheses and would of course figure in tests of models other than those
studied here.

The two special cases of (3.2) of most interest are the $p$th order autoregression (AR($p$))

(3.4)  $S(\omega; \psi) \propto |1 - \sum_{j=1}^{p} \psi_j e^{ij\omega}|^{-2}$,  $1 - \sum_{j=1}^{p} \psi_j z^j \neq 0$,  $|z| \leq 1$

and the $p$th order moving average (MA($p$))

(3.5)  $S(\omega; \psi) \propto |1 + \sum_{j=1}^{p} \psi_j e^{ij\omega}|^{2}$

where there is a factor depending on $\psi$ but not on $\omega$ in each case. The latter is due
to the fact that $\sigma^2$ represents not an innovations variance but $V(u_t)$, irrespective of
whether or not there is serial correlation. As in other applications of the LM test
(e.g. Breusch [8], Godfrey [13], [14]) the same statistic results under both (3.4)
and (3.5). This statistic falls out quickly from the general form we shall derive, and
a reader interested in other alternatives - such as processes with "gaps", as in
seasonal situations, or some of the non-ARMA processes studied in the time series
literature - can readily apply the general formulae. Our class of alternatives and the
conditions above are relevant to many other situations, besides LDV ones, where serial
correlation is to be tested.
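
The equivalence of the AR and MA alternatives for LM purposes can be checked numerically in the first-order case, where the autocorrelations have well-known closed forms: $\rho_j = \psi^j$ for the AR(1) and $\rho_1 = \psi/(1+\psi^2)$, $\rho_j = 0$ for $j > 1$, for the MA(1). The sketch below (function names are assumptions of this example) differentiates both at $\psi = 0$ by central differences; only these derivatives enter a test of white noise.

```python
def rho_ar1(j, psi):
    """jth autocorrelation of an AR(1) process with coefficient psi."""
    return psi ** j


def rho_ma1(j, psi):
    """jth autocorrelation of an MA(1) process u_t = e_t + psi * e_{t-1}."""
    return psi / (1.0 + psi ** 2) if j == 1 else 0.0


def derivative_at_zero(rho, j, h=1e-5):
    """Central-difference approximation to d rho_j(psi) / d psi at psi = 0."""
    return (rho(j, h) - rho(j, -h)) / (2.0 * h)


# Both alternatives give derivative 1 at lag 1 and 0 at higher lags.
ar_derivs = [derivative_at_zero(rho_ar1, j) for j in (1, 2, 3)]
ma_derivs = [derivative_at_zero(rho_ma1, j) for j in (1, 2, 3)]
```

Because the two derivative sequences coincide, any statistic built only from them cannot distinguish AR(1) from MA(1) departures from white noise.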


Another class of stationary alternatives has been suggested by Hosking [17],
in LM testing of ARMA models for stationary time series. In effect, his null rational
spectrum is, under the alternative, multiplied or divided by a factor $|P(\omega)|^2$, where
$P(\omega) = 1 + a(e^{i\omega}) \sum_{j=1}^{p} \psi_j e^{ij\omega}$ and $a(\cdot)$ is either a completely known (not necessarily
rational) function or a function of the serial correlation parameters of the null model
and the $\psi_j$ are unknown. When $H_0$ is white noise the alternative spectrum is thus
proportional to $|P(\omega)|^2$ or $|P(\omega)|^{-2}$. While this class includes alternatives such as (3.4) and
(3.5) the requirement that $a$ be completely specified and $P$ be linear in the unknown
parameters rules out a number of alternatives possible under our approach. For example,
while $|P(\omega)|^{-2}$ can represent an AR spectrum with some coefficients a priori zero,
it does not permit the most economical parameterization when this arises from
multiplication of standard autoregressive and seasonal operators (unless one of the
operators is known) and therefore will not yield an asymptotically locally most powerful
test. Hosking [17] also assumes, in effect, that autocorrelations decay exponentially
even under $H_1$, a stronger assumption than our smoothness conditions on $S(\omega; \psi)$. However,
Hosking's approach is entirely adequate for the purposes of his paper.

Define $R(\psi)$ to be the $T \times T$ Toeplitz matrix with $(j, j+k)$th element $\rho_k(\psi)$.
By $\phi(\cdot; \mu, \Omega)$, $\Phi(\cdot; \mu, \Omega)$ we shall mean the pdf and df respectively of a $T$-variate normal
variable with mean $\mu$ and covariance matrix $\Omega$.

4.  A Test for the Tobit Model

If $X$ is the $T \times K$ matrix with $t$th row $x_t'$, the log-likelihood based on the model
(2.1), (3.2), as obtained in [25], [26], is

(4.1)  $L(\theta, \psi) = \ln \int \cdots \int \phi(y; X\beta, \sigma^2 R(\psi)) \prod_{t=1}^{T} (dy_t)^{1-w_t}$,

the integral having dimension $T - U$ where $U = \sum_{t=1}^{T} w_t$.
(This follows from the fact that $y$ has pdf $\phi(y; X\beta, \sigma^2 R(\psi))$
when all elements of $y$ are nonnegative, and pdf zero otherwise.) Of course (4.1) reduces
to (2.2) under $H_0$, and reduces to the usual Gaussian type of log likelihood when there
is no censoring. For $\psi \neq 0$, $L(\theta, \psi)$ can be expressed in terms of a $U$-dimensional normal
pdf, and a $(T-U)$-dimensional conditional normal df [25], [26], but for present purposes it
is convenient to use the form (4.1).

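The LM statistic for testing (3.1) is

(4.2)  $LM = \ell(\hat\theta)' M(\hat\theta)^{-1} \ell(\hat\theta)$

where $\ell(\theta) = (\partial/\partial\psi) L(\theta, 0)$, $m(\theta) = (\partial/\partial\theta) L(\theta, 0)$,

$M(\theta) = E\{\ell(\theta)\ell(\theta)'\} - E\{\ell(\theta)m(\theta)'\} [E\{m(\theta)m(\theta)'\}]^{-1} E\{m(\theta)\ell(\theta)'\}$,

and expectations are under $H_0$. The computation of $\hat\theta$, and that of LM, involves
evaluation of normal integrals of unit dimension only. To derive a formula for LM
we need derivatives of $\phi(y; X\beta, \sigma^2 R(\psi))$. Because $S(\omega; \psi)$ is differentiable the derivatives
$\rho_{ji} = (\partial/\partial\psi_i)\rho_j(0)$ exist, and let $R_i$ be the $T \times T$ Toeplitz matrix with $(k, j+k)$th
element $\rho_{ji}$ for $j \neq 0$, and zero diagonal elements. By the chain rule $(\partial/\partial\psi_i) R(0)^{-1} = -R_i$
and because all off-diagonal elements of $R(0)$ have zero cofactors, $(\partial/\partial\psi_i) |R(0)| = 0$.
It follows that

$(\partial/\partial\psi_i) \ln \phi(y; X\beta, \sigma^2 R(0)) = \tfrac{1}{2} \sigma^{-2} (y - X\beta)' R_i (y - X\beta)$

and so $(\partial/\partial\psi_i) \exp\{L(\theta, 0)\}$ is given by

$\tfrac{1}{2} \sigma^{-2} \int \cdots \int (y - X\beta)' R_i (y - X\beta) \, \phi(y; X\beta, \sigma^2 R(0)) \prod_{t=1}^{T} (dy_t)^{1-w_t}$

$= \tfrac{1}{2} \sigma^{-2} \sum\sum{}' \rho_{s-t,i} \int \cdots \int z_s z_t \prod_{j=1}^{T} \{ f(z_j, \sigma) (dz_j)^{1-w_j} \}$

with change of variables $z_t = y_t - \beta' x_t$, the multiple integral being over $-\infty < z_j < -\beta' x_j$
for those $j$ such that $w_j = 0$ and the double sum over $s, t = 1, \ldots, T$, $s \neq t$. The last
displayed expression is

$\tfrac{1}{2} \sigma^{-2} \sum\sum{}' \rho_{s-t,i} \{ w_s z_s f(z_s, \sigma) + (1 - w_s) \int_{(s)} z_s f(z_s, \sigma) \, dz_s \}$
$\times \{ w_t z_t f(z_t, \sigma) + (1 - w_t) \int_{(t)} z_t f(z_t, \sigma) \, dz_t \}$
$\times \prod_{j=1,\, j \neq s,t}^{T} \{ w_j f(z_j, \sigma) + (1 - w_j) \int_{(j)} f(z_j, \sigma) \, dz_j \}$

where $\int_{(t)}$ is the integral over $(-\infty, -\beta' x_t)$. Now abbreviate $f(\beta' x_t, \sigma) = f_t$, $F(\beta' x_t, \sigma) = F_t$,
and note, partly for future use, the properties (cf. [1], p. 1001)

(4.3)  $E(w_t) = F_t$,  $E(u_t w_t) = \sigma^2 f_t$,  $E(u_t^2 w_t) = \sigma^2 (F_t - \beta' x_t f_t)$,
       $E(u_t^3 w_t) = \sigma^2 f_t [(\beta' x_t)^2 + 2\sigma^2]$,  $E(u_t^4 w_t) = \sigma^2 [3\sigma^2 F_t - 3\sigma^2 \beta' x_t f_t - (\beta' x_t)^3 f_t]$.

Then $(\partial/\partial\psi_i) \exp\{L(\theta, 0)\}$ can be expressed as

$\tfrac{1}{2} \sigma^{-2} \sum\sum{}' \rho_{s-t,i} \{ w_s z_s f(z_s, \sigma) - (1 - w_s) \sigma^2 f_s \}$
$\times \{ w_t z_t f(z_t, \sigma) - (1 - w_t) \sigma^2 f_t \} \prod_{j=1,\, j \neq s,t}^{T} \{ w_j f(z_j, \sigma) + (1 - w_j)(1 - F_j) \}$

where we can now replace $z_t$ by $u_t$.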
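Because $L(\theta, 0)$ is given by (2.2), we deduce that $\ell(\theta)$ has $i$th element

(4.4)  $\ell_i(\theta) = \tfrac{1}{2} \sigma^{-2} \sum\sum{}' \rho_{s-t,i} v_s v_t = \sigma^{-2} \sum_{k=1}^{T-1} \rho_{ki} \sum_{t=k+1}^{T} v_t v_{t-k}$

where

(4.5)  $v_t = w_t u_t - (1 - w_t) \sigma^2 f_t (1 - F_t)^{-1}$.

Now we evaluate $M(\theta)$. Under $H_0$, the $v_t$ are clearly independent, and because
of (4.3)

(4.6)  $E(v_t) = 0$,  $V(v_t) = \sigma^2 \lambda_t$,

where

(4.7)  $\lambda_t = F_t - \beta' x_t f_t + \sigma^2 f_t^2 (1 - F_t)^{-1}$.

The $v_t$ can be regarded as (heteroscedastic) "residuals" from the Tobit model. On
using (4.6) we see that $E\{\ell(\theta)\ell(\theta)'\}$ has $(i,j)$th element

(4.8)  $\tfrac{1}{2} \sum\sum{}' \rho_{s-t,i} \rho_{s-t,j} \lambda_s \lambda_t = \sum_{k=1}^{T-1} \rho_{ki} \rho_{kj} \sum_{t=k+1}^{T} \lambda_t \lambda_{t-k}$.

To find $E\{\ell(\theta) m(\theta)'\}$ we obtain

$(\partial/\partial\beta) L(\theta, 0) = \sigma^{-2} \sum_{t=1}^{T} x_t v_t$,

$(\partial/\partial\sigma^2) L(\theta, 0) = \tfrac{1}{2} \sigma^{-4} \sum_{t=1}^{T} \{ w_t (u_t^2 - \sigma^2) + (1 - w_t) \beta' x_t \sigma^2 f_t (1 - F_t)^{-1} \}$.

Because the $v_t$ are independent with zero means it immediately follows that
$E\{\ell(\theta) m(\theta)'\} = 0$ under $H_0$. Because $T^{-1} E\{m(\theta) m(\theta)'\}$ is nonsingular under the conditions
of [1], at least in a neighbourhood of the true $\theta$ as $T \to \infty$, we have therefore the simple
form $M(\theta) = E\{\ell(\theta)\ell(\theta)'\}$. The LM statistic is given by (4.2) with $\ell(\hat\theta)$ having $i$th element

$\ell_i(\hat\theta) = \hat\sigma^{-2} \sum_{k=1}^{T-1} \rho_{ki} \hat d_k$,  $\hat d_k = \sum_{t=k+1}^{T} \hat v_t \hat v_{t-k}$,

and $M(\hat\theta)$ having $(i,j)$th element

$M_{ij}(\hat\theta) = \sum_{k=1}^{T-1} \rho_{ki} \rho_{kj} \hat c_k$,  $\hat c_k = \sum_{t=k+1}^{T} \hat\lambda_t \hat\lambda_{t-k}$,

where

(4.9)  $\hat v_t = w_t (y_t - \hat\beta' x_t) - (1 - w_t) \hat\sigma^2 \hat f_t (1 - \hat F_t)^{-1}$,

(4.10)  $\hat\lambda_t = \hat F_t - \hat\beta' x_t \hat f_t + \hat\sigma^2 \hat f_t^2 (1 - \hat F_t)^{-1}$,

$\hat f_t$ and $\hat F_t$ being $f_t$ and $F_t$ with $\theta$ replaced by $\hat\theta$.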
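
The Tobit "residuals" and their variance factors, $v_t = w_t u_t - (1-w_t)\sigma^2 f_t (1-F_t)^{-1}$ and $\lambda_t = F_t - \beta' x_t f_t + \sigma^2 f_t^2 (1-F_t)^{-1}$, evaluated at the estimates, involve only the univariate normal pdf and df. A minimal sketch follows; the function name and data layout are assumptions of this example, and the estimates are taken as given rather than computed.

```python
from statistics import NormalDist


def tobit_residuals(y, x, beta_hat, sigma_hat):
    """Return (v, lam): Tobit residuals and variance factors at given estimates.

    y         : list of observations y_t (zero when censored)
    x         : list of regressor lists x_t
    beta_hat  : list of estimated coefficients
    sigma_hat : estimated standard deviation of u_t
    """
    dist = NormalDist(0.0, sigma_hat)
    sigma2 = sigma_hat ** 2
    v, lam = [], []
    for y_t, x_t in zip(y, x):
        c = sum(b * xi for b, xi in zip(beta_hat, x_t))  # beta'x_t
        f_t, F_t = dist.pdf(c), dist.cdf(c)
        hazard = sigma2 * f_t / (1.0 - F_t)   # sigma^2 f_t (1 - F_t)^{-1}
        # uncensored: ordinary residual; censored: minus the hazard term
        v.append(y_t - c if y_t > 0 else -hazard)
        lam.append(F_t - c * f_t + f_t * hazard)
    return v, lam
```

Each `lam` value exceeds the corresponding $F_t$, which is the Mills' ratio inequality invoked in the asymptotic argument.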

The asymptotic distribution of LM under $H_0$ does not follow from existing
results, partly because of the "mixed" nature of the likelihood and partly
because of the broad nature of the alternative to $H_0$, so we present a rigorous,
but abbreviated, proof. Let Assumptions 1, 2, 3 of [1] and C3 of [26] hold,
plus the assumptions of Section 3. First we show $M = \mathrm{plim}_{T \to \infty} T^{-1} M(\hat\theta)$ exists and
is nonsingular. Put $T^{-1} M_{ij}(\hat\theta) = a_1 + a_2$ where

$a_1 = \sum_{k=1}^{H-1} \rho_{ki} \rho_{kj} T^{-1} \hat c_k$,  $a_2 = \sum_{k=H}^{T-1} \rho_{ki} \rho_{kj} T^{-1} \hat c_k$.

Now under Assumptions 1 and 2 of [1], $\beta' x_t$ and $\sigma^2$ are bounded and $F_t$ is bounded
away from zero. Thus for some $C < \infty$,
$|a_2| \leq C \sum_{k=H}^{\infty} |\rho_{ki} \rho_{kj}| \leq C (\sum_{k=H}^{\infty} \rho_{ki}^2 \sum_{k=H}^{\infty} \rho_{kj}^2)^{1/2}$, which
can be made arbitrarily small for suitably large $H$ by Parseval's theorem and the
square-integrability condition on $(\partial/\partial\psi) S(\omega; 0)$. By application of Theorem 2 of [1]
and the mean value theorem, $\hat\lambda_t = \lambda_t + O_p(T^{-1/2})$ can be established, the error being
uniform in $t$. Thus $\hat\theta$ can be replaced by $\theta$ in $a_1$, which then converges as $T \to \infty$
to a limit under assumption C3 of [26]. On increasing $H$, $M$ is obtained. The fact
that this limit is nonsingular can be seen on noting that $\lambda_t > F_t$ by a well-known
inequality for Mills' ratio, so because $F_t$ is bounded away from zero the smallest
eigenvalue of $M$ is greater than that of a positive scalar multiple of a matrix which
equals (3.3) by Parseval's theorem. To deal with $T^{-1/2} \ell(\hat\theta)$ note that the denominator
$\hat\sigma^2 = \sigma^2 + O_p(T^{-1/2})$ ([1]), and write $\hat v_t = v_t + h_t'(\hat\theta - \theta)$ where, by the mean value theorem,
$E\|h_t\|^4 < C < \infty$ under our conditions. Put $\ell_i(\hat\theta) = \hat\sigma^{-2} \sum_{j=1}^{5} b_j$, where

$b_1 = \sum_{k=1}^{H-1} \rho_{ki} e_k$,  $b_2 = \sum_{k=H}^{T-1} \rho_{ki} e_k$,  $b_3 = (\hat\theta - \theta)' \sum_{k=1}^{T-1} \rho_{ki} A_k (\hat\theta - \theta)$,

$b_4 = \sum_{k=1}^{T-1} \rho_{ki} B_k' (\hat\theta - \theta)$,  $b_5 = \sum_{k=1}^{T-1} \rho_{ki} C_k' (\hat\theta - \theta)$,

$e_k = \sum_t v_t v_{t-k}$,  $A_k = \sum_t h_t h_{t-k}'$,  $B_k = \sum_t v_t h_{t-k}$,  $C_k = \sum_t h_t v_{t-k}$.

Considering $b_3$ first, note that

$E\|A_k\|^2 \leq T \sum_t (E\|h_t\|^4 E\|h_{t-k}\|^4)^{1/2} = O(T^2)$

and thus for some $J$, $1 < J < T-1$,

$E\|\sum_{k=1}^{T-1} \rho_{ki} A_k\| = E\|\sum_{k=1}^{J-1} \rho_{ki} A_k + \sum_{k=J}^{T-1} \rho_{ki} A_k\|$
$\leq (\sum_{k=1}^{J-1} \rho_{ki}^2 \sum_{k=1}^{J-1} E\|A_k\|^2)^{1/2} + (\sum_{k=J}^{T-1} \rho_{ki}^2 \sum_{k=J}^{T-1} E\|A_k\|^2)^{1/2}$.

On choosing $J$ such that $J \to \infty$, $J/T \to 0$ as $T \to \infty$ both components of this are
$o(T^{3/2})$. Then $\hat\theta - \theta = O_p(T^{-1/2})$ ([1]) implies $b_3 = o_p(T^{1/2})$. The same result is obtained
for $b_4$ and $b_5$ on noting that, because $w_t$ is the only stochastic component of $h_t$,

$E\|B_k\|^2 = E(\sum\sum v_t v_s h_{t-k}' h_{s-k}) \leq C \sum_t E\|h_{t-k}\|^2 = O(T)$.

Thus $b_3$, $b_4$ and $b_5$ make no asymptotic contribution to the distribution of
$T^{-1/2} \ell(\hat\theta)$. Finally we apply Bernstein's lemma to $b_1$ and $b_2$. First

$E(b_2^2) = \sum_{k=H}^{T-1} \sum_{l=H}^{T-1} \rho_{ki} \rho_{li} E(e_k e_l) \leq C T \sum_{k=H}^{\infty} \rho_{ki}^2$

so because $H$ is arbitrary $T^{-1/2} b_2 = o_p(1)$. On the other hand application of
Liapounoff's central limit theorem and assumption C3 of [26] shows that
$T^{-1/2}(e_1, \ldots, e_H)$ converges, as $T \to \infty$ but $H$ stays fixed, to an $H$-variate normal variable
so $T^{-1/2} b_1$ is asymptotically normal. Letting $H$ increase, $T^{-1/2} \ell(\hat\theta)$ is asymptotically normal
with zero mean and covariance matrix $M$. (Our square-integrability condition on
$(\partial/\partial\psi) S(\omega; 0)$ is clearly necessary for this result.) It follows that LM has the
usual asymptotic $\chi^2_p$ distribution under $H_0$.

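An approximate form of the LM statistic (4.2), which is useful in those cases
where an expression for $S(\omega; \psi)$ is easier to obtain than one for $\rho_j(\psi)$, uses

$\tilde\ell_i(\hat\theta) = (2T)^{-1} \hat\sigma^{-2} \sum_{k=1}^{T} (\partial/\partial\psi_i) S(\omega_k; 0) \, |\sum_{t=1}^{T} \hat v_t \exp(it\omega_k)|^2$,

$\tilde M_{ij}(\hat\theta) = (2T^2)^{-1} \sum_{k=1}^{T} \sum_{l=1}^{T} (\partial/\partial\psi_i) S(\omega_k; 0) \, (\partial/\partial\psi_j) S(\omega_l; 0) \, |\sum_{t=1}^{T} \hat\lambda_t \exp[it(\omega_k - \omega_l)]|^2$,

where $\omega_k = 2\pi k/T$, suggesting possible use of the fast Fourier transform.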
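
To see why sums over the frequencies $2\pi k/T$ reproduce the lagged products of residuals, consider a first-order alternative, for which the derivative of $S(\omega; \psi)$ at $\psi = 0$ is $2\cos\omega$. The sketch below verifies numerically that the weighted periodogram sum equals $2T$ times the sum of the lag-1 and lag-$(T-1)$ products of the sequence, the second being an end effect. Function names are assumptions of this example, and a direct discrete Fourier transform is used only to keep the code dependency-free; an FFT routine would compute the same sums faster.

```python
import cmath
from math import cos, pi


def periodogram_sums(v):
    """|sum_t v_t exp(i t w_k)|^2 at w_k = 2*pi*k/T for k = 1, ..., T (direct DFT)."""
    T = len(v)
    sums = []
    for k in range(1, T + 1):
        w_k = 2.0 * pi * k / T
        total = sum(v_t * cmath.exp(1j * (t + 1) * w_k) for t, v_t in enumerate(v))
        sums.append(abs(total) ** 2)
    return sums


def score_sum_ar1(v):
    """Sum over k of 2*cos(w_k) * |sum_t v_t exp(i t w_k)|^2."""
    T = len(v)
    spec = periodogram_sums(v)
    return sum(2.0 * cos(2.0 * pi * k / T) * spec[k - 1] for k in range(1, T + 1))


def lagged_product(v, k):
    """Sum of v_t * v_{t-k} over the available t."""
    return sum(v[t] * v[t - k] for t in range(k, len(v)))
```

For any real sequence `v` of length `T > 2`, `score_sum_ar1(v)` agrees with `2 * T * (lagged_product(v, 1) + lagged_product(v, T - 1))` up to rounding error.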

Finally we deduce the form of LM for testing against alternatives (3.4) and
(3.5). For both the AR($p$) (3.4) and the MA($p$) (3.5),

(4.11)  $\rho_{ki} = 1$, when $i = k$;  $\rho_{ki} = 0$, when $i \neq k$,

so we have the computationally simple form

(4.12)  $LM(p) = T \sum_{k=1}^{p} \hat r_k^2$,  $\hat r_k = \hat\sigma^{-2} (T \hat c_k)^{-1/2} \hat d_k$.

As the degree of censoring becomes negligible the $\hat r_k$ approach standard sample
autocorrelations and $LM(p)$ approaches the statistic of Box and Pierce [7], which is
often used in a pure significance test (c.f. [17]).
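
For AR($p$) or MA($p$) alternatives the matrix $M(\hat\theta)$ is diagonal, so the quadratic form (4.2) reduces to a sum over lags of squared lagged residual products, each divided by $\hat\sigma^4$ times the matching sum of variance-factor products. A minimal sketch, assuming the residuals and variance factors have already been computed and using hypothetical names throughout:

```python
from math import sqrt

# Upper 10% points of the chi-square distribution with 1, 2 and 3 degrees of freedom
CHI2_10PCT = {1: 2.706, 2: 4.605, 3: 6.251}


def lm_statistic(v, lam, sigma2_hat, p):
    """LM(p) against AR(p) or MA(p) errors in the Tobit model.

    v          : Tobit residuals evaluated at the estimates
    lam        : matching variance factors
    sigma2_hat : estimate of sigma^2
    p          : order of the alternative
    """
    T = len(v)
    stat = 0.0
    for k in range(1, p + 1):
        d_k = sum(v[t] * v[t - k] for t in range(k, T))      # lagged residual products
        c_k = sum(lam[t] * lam[t - k] for t in range(k, T))  # lagged variance factors
        r_k = d_k / (sigma2_hat * sqrt(T * c_k))             # standardized "autocorrelation"
        stat += T * r_k ** 2
    return stat
```

The null hypothesis of white noise would be rejected at the asymptotic 10% level when `lm_statistic(v, lam, sigma2_hat, p)` exceeds `CHI2_10PCT[p]`.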

5.   Monte  Carlo  Power  Approximations 

As usual the LM procedure produces locally most powerful consistent tests.
Under local alternatives of the form $\psi = \delta T^{-1/2}$, where $\delta$ is a fixed, nonnull vector,
the LM statistic has for large $T$ an approximate noncentral $\chi^2_p$ distribution, with
noncentrality parameter $T^{-1} \delta' M(\theta) \delta$. For information on finite-sample and nonlocal
power, we turn to Monte Carlo simulations.
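
For a single parameter ($p = 1$) the local power approximation can be evaluated with the normal df alone, because a noncentral $\chi^2_1$ variable is the square of a unit-variance normal variable whose mean is the square root of the noncentrality parameter. The sketch below is restricted to that case; the function name is an assumption of this example, and general $p$ would need a noncentral chi-square routine that is not shown.

```python
from math import sqrt
from statistics import NormalDist


def local_power_p1(noncentrality, level=0.10):
    """Approximate power of a level-`level` test based on LM(1).

    noncentrality : value of the noncentrality parameter (nonnegative)
    """
    z = NormalDist()
    crit = z.inv_cdf(1.0 - level / 2.0) ** 2  # chi-square(1) critical value
    root_c, root_nc = sqrt(crit), sqrt(noncentrality)
    # P((Z + root_nc)^2 > crit) for a standard normal Z
    return (1.0 - z.cdf(root_c - root_nc)) + z.cdf(-root_c - root_nc)
```

With zero noncentrality the function returns the nominal level, and the approximate power rises towards one as the noncentrality parameter grows.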

Two Tobit models were employed, one of them (2.4), which we now call
Model 1, with $\beta$ = 1.0 only, and

Model 2:  $y_t = \max(\beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t,\ 0)$,

where $\beta_1 = -6.0$, $\beta_2 = 2.0$, $\beta_3 = 0.5$ and $x_{2t}$ and $x_{3t}$ were generated independently
as uniform [0,2] and [0,20] variates, respectively, and the same realization of
these was used for all replications, to accord with the assumptions made in [1],
[26] and above, that $x_t$ is nonstochastic. More interesting $x_{2t}$ and $x_{3t}$ sequences
could have been employed, reflecting serial correlation in the exogenous variables,
but such modifications seem unlikely to affect our results.

The first part of the investigation aimed to examine the relevance in small
and moderate samples of the $\chi^2_p$ critical region for LM established in the previous
section. In both Models 1 and 2, 500 replications of white noise $u_t$ sequences were
generated with T = 30, 50, 70, 100 and 150, and LM(p) given by (4.12) calculated in each
case, for p = 1, 2, 3. Averages and variances of LM(p) were computed, and compared
with the asymptotic values p and 2p, respectively. Type I error probabilities were
estimated by the proportion of times LM(p) exceeded the $\chi^2_p$ 10% critical values. The
results appear in Tables III and IV, for Models 1 and 2 respectively. The number of
replications is too small to provide very accurate estimates of the distribution of
LM(p), as the lack of monotonicity with increasing T suggests, but the results do at least
provide a rough picture. On the whole the $\chi^2$ approximation does not come off badly,
even for the smallest values of T, but there is little improvement with sample size
over the range considered. The finite-sample distributions tend to have lighter upper
tails, and this is not surprising in view of the fact that in some other situations
the LM statistic can be shown to be numerically less than other statistics with
equivalent asymptotic local power.

(Tables  III  and  IV  about  here) 

We go on to examine the power of our test against alternatives specified by

Alternative 1:  $S(\omega) \propto |1 - 0.4 e^{i\omega}|^{-2}$,  (AR(1)),
Alternative 2:  $S(\omega) \propto |1 - 0.8 e^{i\omega}|^{-2}$,  (AR(1)),
Alternative 3:  $S(\omega) \propto |1 - 0.4 e^{i\omega} - 0.4 e^{2i\omega}|^{-2}$,  (AR(2)),

where as in Section 2, the innovations variances were adjusted to produce $\sigma^2 = 2$
in each case. Alternatives 1 and 2 were considered to illustrate the effects of weak
and  strong  autocorrelation,  and  Alternative  3  was  considered  primarily  to  investigate 
what  happens  when  p  in  LM(p)  is  chosen  too  small,  specifically,  when  p  =  1.  We  call 
the  latter  situation  "undertesting".   We  did  in  fact  compute  LM(p)  for  p  =  1,2,3  for 
each  alternative,  LM(1)  providing  an  efficient  test  against  Alternatives  1  and  2,  and 
LM(2)  being  efficient  against  Alternative  3.   Information  was  also  thereby  gained  in 
"overtesting",  when  the  test  statistic  overrates  the  alternative.   In  both  undertesting 
and  overtesting,  some  loss  of  power  is  to  be  expected.  Power  was  measured  as  the 
proportion  of  times  empirical  10%  significance  points  were  exceeded,  these  being  the 
appropriate  order  statistics  based  on  data  generated  under  the  null  hypothesis. 
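The error processes for the three alternatives can be generated along the following lines. This is a minimal sketch, not the simulation code behind the tables. It reads Alternatives 1 to 3 as Gaussian AR(1) with coefficient 0.4, AR(1) with coefficient 0.8 and AR(2) with coefficients (0.4, 0.4), and takes the target variance of the errors to be 2; the helper names `innovation_variance` and `simulate_ar` are hypothetical. The innovation variance follows from the Yule-Walker relation $\gamma_0 = \sigma_e^2 / (1 - \sum_j \phi_j \rho_j)$.

```python
import random

# AR coefficients as read from Alternatives 1-3 (an assumption of this sketch).
ALTERNATIVES = {1: [0.4], 2: [0.8], 3: [0.4, 0.4]}


def innovation_variance(phi, target_var=2.0):
    """Innovation variance giving Var(u_t) = target_var for a stationary AR(1) or AR(2)."""
    if len(phi) == 1:
        rho = [phi[0]]
    elif len(phi) == 2:
        rho1 = phi[0] / (1.0 - phi[1])
        rho = [rho1, phi[0] * rho1 + phi[1]]
    else:
        raise ValueError("this sketch handles AR(1) and AR(2) only")
    # Yule-Walker: gamma_0 = sigma_e^2 / (1 - sum_j phi_j * rho_j)
    return target_var * (1.0 - sum(p * r for p, r in zip(phi, rho)))


def simulate_ar(phi, T, target_var=2.0, burn_in=500, seed=0):
    """Draw u_1..u_T from the AR process after discarding a burn-in stretch."""
    rng = random.Random(seed)
    sd = innovation_variance(phi, target_var) ** 0.5
    p = len(phi)
    u = [0.0] * p
    for _ in range(burn_in + T):
        shock = rng.gauss(0.0, sd)
        u.append(sum(phi[j] * u[-1 - j] for j in range(p)) + shock)
    return u[-T:]
```

For example, `simulate_ar(ALTERNATIVES[3], T=50)` gives one replication of the errors under Alternative 3.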

For each combination of Model, Alternative and T-value, 500 replications
were generated and the results are displayed in Tables V and VI. We make the following
comments. (1) A comparison of the two tables suggests that, particularly in small
samples, power varies inversely with the number of $\theta$ parameters. (2) Power increases
monotonically with T but the small sample results are slightly disappointing, at least
in part, unless placed in perspective: even under $H_0$ only asymptotic properties of
$\hat\theta$ are established in [1] and these may produce misleading inferences in small samples;
our LM statistic depends on $\hat\theta$, in fact. (3) As expected, powers for Alternative 2
always exceed those for Alternative 1, and LM(2) almost never does better than LM(1)
for these alternatives. (4) In small samples, loss of power seems to vary with
"degree" of overtesting. For example, in Table V under Alternative 1 with T = 50,
LM(1) has power 0.854, while the powers for LM(2) and LM(3) are respectively 0.718
and 0.674. This effect decreases as T increases, as is expected because each test is
consistent against many alternatives, not merely those included in the alternative
hypothesis. (6) Loss of
power caused by undertesting can be seen by comparing the results for LM(1) and LM(2)
under Alternative 3; again this seems important only in the smaller samples. The
fact that LM(3) is always better than LM(1) in this case appears to suggest that
overspecification of the alternative is better than underspecification.
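The power measure used for Tables V and VI, the proportion of replications exceeding an empirical 10% point obtained under the null hypothesis, can be sketched as follows. The exact order statistic used is not stated in the text, so the choice below (the 450th of 500 ordered null values at the 10% level) is an assumption, and the function names are hypothetical.

```python
def empirical_critical_value(null_stats, level=0.10):
    """Order statistic of null-simulated statistics exceeded with frequency about `level`."""
    ordered = sorted(null_stats)
    n = len(ordered)
    k = int(round((1.0 - level) * n))  # e.g. the 450th of 500 values for level = 0.10
    k = min(max(k, 1), n)
    return ordered[k - 1]


def estimated_power(alt_stats, null_stats, level=0.10):
    """Proportion of statistics simulated under the alternative that exceed the empirical point."""
    critical = empirical_critical_value(null_stats, level)
    return sum(1 for s in alt_stats if s > critical) / len(alt_stats)
```

Because the critical value comes from the simulated null distribution rather than from the $\chi^2$ table, powers of LM(1), LM(2) and LM(3) computed this way are compared at the same empirical size.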

(Tables  V  and  VI  about  here) 

6.  A Test for the Truncated Model

Many  economic  variables  take  only  non-negative  values,  and  yet  cannot  be 
described  by  the  Tobit  model  because  they  have  no  atoms  of  positive  probability. 
Models  based  on  the  truncated  normal  distribution  can  describe  such  data,  as  an 
alternative  to,  for  example,  linear  models  for  logged  variables. 


The model is defined by the assumption that $(y_1,\ldots,y_T)$ has pdf

$$[1-\Phi(0; X\beta, \sigma^2 R(\psi))]^{-1}\, \phi(\xi; X\beta, \sigma^2 R(\psi))$$

when $\xi$ is a vector of positive elements, and zero otherwise. Therefore the log-likelihood is

$$L(\theta,\psi) = \ln \phi(y; X\beta, \sigma^2 R(\psi)) - \ln[1-\Phi(0; X\beta, \sigma^2 R(\psi))].$$

Now

$$L(\theta,0) = \sum_{t=1}^{T} \{\ln f(y_t - \beta' x_t, \sigma) - \ln F(\beta' x_t, \sigma)\}$$

but for $\psi \neq 0$, $L(\theta,\psi)$ will generally be unmanageable.
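The independence log-likelihood $L(\theta,0)$ is simple to evaluate. The following is a minimal sketch, assuming that $f(\cdot,\sigma)$ and $F(\cdot,\sigma)$ denote the $N(0,\sigma^2)$ density and distribution function; the function names are illustrative only.

```python
import math


def normal_pdf(u, sigma):
    """Density of N(0, sigma^2) at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(c, sigma):
    """Distribution function of N(0, sigma^2) at c."""
    return 0.5 * (1.0 + math.erf(c / (sigma * math.sqrt(2.0))))


def truncated_loglik(y, x, beta, sigma):
    """L(theta, 0) = sum_t { ln f(y_t - beta'x_t, sigma) - ln F(beta'x_t, sigma) }."""
    total = 0.0
    for y_t, x_t in zip(y, x):
        if y_t <= 0.0:
            raise ValueError("the truncated model only admits positive observations")
        index = sum(b * v for b, v in zip(beta, x_t))
        total += math.log(normal_pdf(y_t - index, sigma)) - math.log(normal_cdf(index, sigma))
    return total
```

Each `x_t` is a list of regressors of the same length as `beta`; maximising this function over `beta` and `sigma` gives the estimator based on serial independence.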

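To conserve on space we list only the formulae for the LM statistic (4.2), which
are derived as in Section 4. With $g_t = f_t/F_t$, $u_t = y_t - \beta' x_t$ we have

$$\ell_i(\theta) = \sigma^{-2} \sum_{k=1}^{T-1} \rho_{ki} \sum_{t=k+1}^{T} (u_t u_{t-k} - \sigma^4 g_t g_{t-k}),$$

$$m(\theta) = \sigma^{-2} \sum_{t=1}^{T} \left( x_t'[u_t - \sigma^2 g_t],\ \tfrac{1}{2}\sigma^{-2}(u_t^2 - \sigma^2 + \sigma^2 \beta' x_t g_t) \right)',$$

$$E\{\ell_i(\theta)\ell_j(\theta)\} = \sum_{k=1}^{T-1} \rho_{ki}\rho_{kj} \sum_{t=k+1}^{T} \{(1-\beta' x_t g_t)(1-\beta' x_{t-k} g_{t-k}) - \sigma^4 g_t^2 g_{t-k}^2\}
 + \sigma^2 \sum\sum\sum_{s \neq t \neq v} \rho_{|s-t|,i}\,\rho_{|s-v|,j}\,(1-\beta' x_s g_s - \sigma^2 g_s^2)\, g_t g_v,$$

$$E\{\ell_i(\theta) m(\theta)'\} = \sum_{k=1}^{T-1} \rho_{ki} \sum_{t=k+1}^{T} \left( x_t'(1-\beta' x_t g_t - \sigma^2 g_t^2)\, g_{t-k},\ \tfrac{1}{2}\sigma^{-2}(\sigma^2 + \sigma^2 \beta' x_t g_t + (\beta' x_t)^2)\, g_t g_{t-k} \right),$$

$$E\{m(\theta)m(\theta)'\} = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{pmatrix},$$

where

$$A_{11} = \sigma^{-2} \sum_{t=1}^{T} x_t x_t' (1-\beta' x_t g_t - \sigma^2 g_t^2),$$

$$A_{12} = \tfrac{1}{2}\sigma^{-4} \sum_{t=1}^{T} x_t g_t (\sigma^2 + \sigma^2 \beta' x_t g_t + (\beta' x_t)^2),$$

$$A_{22} = \tfrac{1}{4}\sigma^{-6} \sum_{t=1}^{T} \{2\sigma^2 - \sigma^2 \beta' x_t g_t - \sigma^2 (\beta' x_t)^2 g_t^2 - (\beta' x_t)^3 g_t\}.$$
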

The LM statistic is less simple in form than in the Tobit case, because
$E(\ell(\theta)m(\theta)') \neq 0$, although some simplification results under alternatives (3.4) and (3.5)
which again produce the same statistics (use (4.11)). An asymptotic $\chi^2$ distribution
can be established for LM under the same conditions as before, plus the condition that
for each positive integer pair j, k, j < k, the empirical df of $(x_1, x_{1+j}, x_{1+k}), \ldots,
(x_{T-k}, x_{T-k+j}, x_T)$ converges as $T \to \infty$ to a df.
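The blocks $A_{11}$, $A_{12}$ and $A_{22}$ of this section rest on the first four moments of a normal variate truncated below at $-\beta' x_t$. The following is a minimal sketch, assuming a scalar regressor and taking $f$ and $F$ to be the $N(0,\sigma^2)$ density and distribution function evaluated at $\beta' x_t$; the function names are hypothetical and the formulas are as read here from the text.

```python
import math


def normal_pdf(u, sigma):
    """Density of N(0, sigma^2) at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(c, sigma):
    """Distribution function of N(0, sigma^2) at c."""
    return 0.5 * (1.0 + math.erf(c / (sigma * math.sqrt(2.0))))


def g_ratio(c, sigma):
    """g_t = f_t / F_t evaluated at c = beta'x_t."""
    return normal_pdf(c, sigma) / normal_cdf(c, sigma)


def truncated_moments(c, sigma):
    """Mean and variance of u_t = y_t - beta'x_t given y_t > 0."""
    g = g_ratio(c, sigma)
    s2 = sigma ** 2
    return s2 * g, s2 * (1.0 - c * g - s2 * g * g)


def information_terms(x, c, sigma):
    """Contribution of one observation to (A11, A12, A22), scalar x and c = beta * x."""
    g = g_ratio(c, sigma)
    s2 = sigma ** 2
    a11 = x * x * (1.0 - c * g - s2 * g * g) / s2
    a12 = 0.5 * x * g * (s2 + s2 * c * g + c * c) / s2 ** 2
    a22 = 0.25 * (2.0 * s2 - s2 * c * g - s2 * c * c * g * g - c ** 3 * g) / s2 ** 3
    return a11, a12, a22
```

Summing `information_terms` over t gives the three blocks; a numerical integration of the squared scores against the truncated density reproduces the same values.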

7.  Models  Containing  Lagged  Dependent  Variables 

Regression models for economic time series frequently contain lagged
dependent variables. Tests for serial correlation of residuals in such models
have been given by Durbin [10], Breusch [8], and Godfrey [13]. Naturally,
one would like to consider such questions for LDV models, and by analogy with the
standard uncensored case serial correlation would be expected to produce inconsistent
parameter estimators, unlike in models containing no lagged dependent variables.
Clearly the conditions imposed in [1], [26] exclude such variables, and so the
results there cannot be applied to derive the asymptotic distribution of even an LM
statistic for testing white noise residuals, and no immediately applicable results
are currently available. However the LM principle does produce statistics which
should at least provide informal indicators of serial correlation, and so we shall
briefly explore the lagged dependent variable case.

We consider only models of Tobit type, and observe first that (2.1) can
be extended to include lagged dependent variables in at least two ways. The
first of these, proposed by Robinson [24], and called here Model A, postulates
an unobservable variable $y_t^*$, and $y_t$ is given by (2.1) with $Y_{t-1}^* = (y_{t-1}^*,\ldots,y_{t-L}^*)'$
the first L elements of $x_t$. In other words $y_t$ is given by

$$y_t^* = \alpha' Y_{t-1}^* + \gamma' z_t + u_t, \qquad y_t = \max(y_t^*, 0),$$

$z_t$ being the last J = K - L elements of $x_t$ and $(\alpha', \gamma')' = \beta$. (The notation in [24]
differs, in particular $y$ and $y^*$ are interchanged.) Introduce the $T \times T$ matrix


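$$A = \begin{pmatrix}
0 & & & & \\
\alpha_1 & 0 & & & \\
\vdots & \ddots & \ddots & & \\
\alpha_L & \cdots & \alpha_1 & 0 & \\
 & \ddots & & \ddots & \ddots \\
0 \ \cdots \ 0 & \alpha_L & \cdots & \alpha_1 & 0
\end{pmatrix},$$

$\alpha_j$ being the jth element of $\alpha$. Then from [24] the likelihood for Model A with
$y_0^* = \cdots = y_{1-L}^* = 0$ is

$$\int \cdots \int \phi\left(y; (I-A)^{-1}Z\gamma,\ \sigma^2 (I-A)^{-1} R(\psi) [(I-A)^{-1}]'\right) \prod_{t=1}^{T} (dy_t)^{1-w_t},$$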

where I is the T-rowed identity matrix and Z is the T-rowed matrix with tth row
$z_t'$. As noted in [24], even for white noise $u_t$ this likelihood involves a
T-U-dimensional correlated normal df, except when censorings are suitably "sparse".
Thus even LM tests of (5.1) cannot generally be contemplated. One test that is
possible is that the lagged unobservable dependent variables can be omitted, that is
$\alpha = 0$, the LM statistic depending on only univariate normal probabilities. The
statistic can be readily derived, although it is not of so simple a form as those
derived in Section 4. It has an asymptotic $\chi^2_L$ distribution under the same
conditions as before.

The second model, called Model B, simply includes lagged observed dependent
variables in $x_t$. Putting $Y_{t-1} = (y_{t-1},\ldots,y_{t-L})'$, (2.1) is then

$$y_t = \max(\alpha' Y_{t-1} + \gamma' z_t + u_t,\ 0). \tag{7.1}$$

First suppose $u_t$ is white noise. Writing the likelihood as a product of conditional
likelihoods, and noting that the Markovian structure of (7.1) means that the
conditional distribution of $y_t$ given $y_{t-1},\ldots,y_1$ depends only on $y_{t-1},\ldots,y_{t-L}$ for
t > L, we have exactly the same form of log likelihood as (2.2) with $x_t = (Y_{t-1}', z_t')'$,
so $\hat\theta$ is easily computed. However, this $x_t$ does not satisfy the conditions in [1]
so we are not in a position to assert asymptotic properties of $\hat\theta$. On the other
hand, LM tests - or Wald or LR tests for that matter - of $\alpha = 0$ can be rigorously
justified, as before. The LM statistic is of a slightly different form from that
for Model A, mentioned above. The same type of test can, incidentally, be used
more generally to determine whether elements - stochastic or deterministic - can
be omitted from $x_t$; this sort of question can be handled by an exact finite-sample
F-test for the uncensored linear model under our conditions.
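The difference between the two lagged dependent variable models, and the tractability of Model B under white noise, can be illustrated as follows. This is a minimal sketch with a single lag (L = 1), a scalar $z_t$ and pre-sample values set to zero; the log-likelihood is the usual Tobit form with $x_t = (y_{t-1}, z_t)$, which is what (2.2) is taken to be here. All function and argument names are hypothetical.

```python
import math
import random


def simulate(model, alpha, gamma, z, sigma, seed=0):
    """Simulate y_1..y_T with white noise errors.

    model "A": the lag entering the equation is the unobservable y*_{t-1};
    model "B": the lag is the observed (censored) y_{t-1}.
    """
    rng = random.Random(seed)
    latent_prev, observed_prev, path = 0.0, 0.0, []
    for z_t in z:
        lag = latent_prev if model == "A" else observed_prev
        latent = alpha * lag + gamma * z_t + rng.gauss(0.0, sigma)
        observed = max(latent, 0.0)
        path.append(observed)
        latent_prev, observed_prev = latent, observed
    return path


def model_b_loglik(y, z, alpha, gamma, sigma):
    """Tobit log-likelihood for Model B, built from one-step conditional densities."""
    total, y_prev = 0.0, 0.0
    for y_t, z_t in zip(y, z):
        index = alpha * y_prev + gamma * z_t
        if y_t > 0.0:
            u = y_t - index
            total += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (u / sigma) ** 2
        else:
            # P(y_t = 0 | past) = Phi(-index / sigma)
            total += math.log(0.5 * math.erfc(index / (sigma * math.sqrt(2.0))))
        y_prev = y_t
    return total
```

Only univariate normal probabilities appear in `model_b_loglik`, which is the computational point made above; no comparably simple function exists for Model A because the lagged latent values are unobserved whenever censoring occurs.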

An important implication of the computational tractability of the likelihood
for Model B with white noise $u_t$ is that LM tests of white noise will be easy to
compute. Serial correlation in $u_t$, even of the autoregressive type, destroys the
Markovian structure of the model. To obtain the likelihood introduce

$$Y = (y_1,\ldots,y_T)', \qquad U = (u_1,\ldots,u_T)', \qquad W = \mathrm{diag}\{w_1,\ldots,w_T\}.$$

The df of Y is thus

$$P(Y \le \xi) = P(W(AY + Z\gamma + U) \le \xi) = P(AY + Z\gamma + U \le \xi)$$

if $\xi$ has nonnegative elements. The last expression is

$$P(\{A(I-WA)^{-1}W + I\}(Z\gamma + U) \le \xi) = P((I-AW)^{-1}(Z\gamma + U) \le \xi)$$

$$= \sum_i P((I-AW_i)^{-1}(Z\gamma + U) \le \xi,\ W = W_i), \tag{7.2}$$

where $W_i$ is a diagonal matrix such that each diagonal element is either zero or
one, and the sum is over all $2^T$ such matrices. (W is a random matrix.) Now the
event that $w_t = 1\,(0)$ is the event that the tth element of $(I-AW)^{-1}(Z\gamma+U)$ is
positive (nonpositive) and because this tth element depends on the $w_j$ only
for j < t it follows that the event that $W = W_i$ is equivalent to the event that
$(I-AW_i)^{-1}(Z\gamma+U)$ has elements which are positive or nonpositive depending on
whether the corresponding diagonal element of $W_i$ is 1 or 0. Thus (7.2) is

$$\sum_i P(-(I-W_i)\infty < (I-AW_i)^{-1}(Z\gamma + U) \le W_i \xi), \tag{7.3}$$

where by $\infty$ we mean here the vector all of whose elements are "infinity", and
"infinity" x 0 = 0. On differentiating (7.3) with respect to elements of
$\xi$ corresponding to positive $y_t$ in the available sample, and then substituting
the observed Y for $\xi$, it is seen that only one of the summands in (7.3) makes a
contribution, the one for which $W_i$ corresponds to the configuration of $w_t$ in the
available sample. When $y_0 = \cdots = y_{1-L} = 0$ this contribution, the likelihood, is
clearly

$$\int \cdots \int \phi\left(y; (I-AW)^{-1}Z\gamma,\ \sigma^2 (I-AW)^{-1} R(\psi) [(I-AW)^{-1}]'\right) \prod_{t=1}^{T} (dy_t)^{1-w_t},$$

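reducing to (2.2) when $\psi = 0$. Thereupon we propose the statistic

$$\widetilde{LM} = \ell(\hat\theta)' (D - E G^{-1} E')^{-1} \ell(\hat\theta),$$

where $\ell(\theta)$ is given by (4.4) and D, E and G are as follows.
The matrix D has (i,j)th element

$$\hat\sigma^{-2} \sum_{k=1}^{T-1} \rho_{ki}\rho_{kj} \sum_{t=k+1}^{T} \hat\Lambda_t \hat v_{t-k}^2,$$

where $\hat v_t$ and $\hat\Lambda_t$ are given by (4.9) and (4.10), the true $\Lambda_t$ being given by (4.7)
but being derived from $E(v_t^2 \mid y_{t-1},\ldots,y_{t-L}) = \sigma^2 \Lambda_t$; E has ith row

$$\hat\sigma^{-2} \sum_{k=1}^{T-1} \rho_{ki} \sum_{t=k+1}^{T} \hat v_{t-k} \left( \hat\Lambda_t x_t',\ \tfrac{1}{2}\sigma^{-2} f_t \{(\beta' x_t)^2 + \sigma^2 - \sigma^2 \beta' x_t f_t (1-F_t)^{-1}\} \right),$$

and

$$G = \begin{pmatrix} G_{11} & G_{12} \\ G_{12}' & G_{22} \end{pmatrix},$$

where (see (6.16) - (6.18) of [1])

$$G_{11} = -\sigma^{-2} \sum_{t=1}^{T} \{\beta' x_t f_t - \sigma^2 f_t^2 (1-F_t)^{-1} - F_t\}\, x_t x_t',$$

$$G_{12} = \tfrac{1}{2}\sigma^{-4} \sum_{t=1}^{T} f_t \{(\beta' x_t)^2 + \sigma^2 - \sigma^2 \beta' x_t f_t (1-F_t)^{-1}\}\, x_t,$$

$$G_{22} = -\tfrac{1}{4}\sigma^{-6} \sum_{t=1}^{T} \{(\beta' x_t)^3 f_t + \sigma^2 \beta' x_t f_t - \sigma^2 (\beta' x_t)^2 f_t^2 (1-F_t)^{-1} - 2\sigma^2 F_t\}.$$

This $\widetilde{LM}$ is not actually the LM defined by (4.2). In the uncensored lagged dependent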
variable case the information matrix under $H_0$ is typically replaced by a
consistent estimator (see [13] for example). We have attempted to do the same
thing here but we do not understand the process (7.1) sufficiently well to assert
that this modification does not alter the asymptotic distribution, and in any
case, of course, we cannot verify a $\chi^2_p$ asymptotic distribution for either LM
or its modified version in this case. On the other hand, if $\hat\theta$ is consistent for $\theta$ (which again
we cannot assert) then the modified statistic will clearly reflect serial correlation in $u_t$ and there
is no reason why it should not at least be used in an informal fashion in
applied work. Work is currently under way to establish asymptotic properties of
estimators of a generalisation of model (7.1).
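The matrix step behind (7.2), namely that the latent index $AY + Z\gamma + U$ equals $(I-AW)^{-1}(Z\gamma + U)$ and that its tth element depends on the $w_j$ only for j < t, can be checked numerically. The following is a minimal sketch in which the list `v` stands for the vector $Z\gamma + U$, `alpha` holds $\alpha_1,\ldots,\alpha_L$, and pre-sample values of y are zero; the function names are hypothetical.

```python
def latent_by_recursion(alpha, v):
    """Latent indices s_t = sum_j alpha_j * y_{t-j} + v_t, with y_t = max(s_t, 0)."""
    s, y = [], []
    for t, v_t in enumerate(v):
        s_t = v_t + sum(a * y[t - j] for j, a in enumerate(alpha, start=1) if t - j >= 0)
        s.append(s_t)
        y.append(max(s_t, 0.0))
    return s, y


def latent_by_matrix_form(alpha, w, v):
    """Solve (I - A W) s = v by forward substitution.

    A carries alpha_j on its jth subdiagonal and W = diag(w), so row t of the
    system only involves w_j and s_j for j < t.
    """
    s = []
    for t, v_t in enumerate(v):
        s.append(v_t + sum(a * w[t - j] * s[t - j]
                           for j, a in enumerate(alpha, start=1) if t - j >= 0))
    return s
```

Setting `w[t] = 1` when `y[t] > 0` and 0 otherwise, the two functions return the same vector, which is the content of the identity $\{A(I-WA)^{-1}W + I\} = (I-AW)^{-1}$ used in the text.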

8.  Final  Comments 
We  have  studied  Lagrange-multiplier  tests  for  serial  dependence  in  a 
number  of  econometric  models  containing  limited  dependent  variables.   Serial 
dependence  does  not  produce  inconsistency  in  the  usual  point  estimators  based  on 
serial  independence,  but  our  simulations  in  Section  2  demonstrate  that  its  effect 
in  finite  samples  can  be  considerable,  and  under  serial  dependence  the  usual 
estimators  are  asymptotically  inefficient  and  use  of  the  usual  formulae  for  asymp- 
totic covariance  matrices  is  invalid  and  possibly  productive  of  misleading 
inferences.   The  LM  test  has  been  applied  in  many  settings  in  recent  years,  and  its 
use  generally  involves  some  computational  savings   relative  to  other  asymptotically 
locally  most  powerful  strategies,  such  as  the  Wald  and  likelihood  ratio   tests. 
The potential savings afforded by the LM test in our models are much greater than
in perhaps any of the other applications that have been studied, and the alternative
strategies can be contemplated only in very special situations.   All our test
statistics  have  been  derived  against  a  very  general  class  of  serially  correlated 
alternatives,  which  is  of  some  independent  interest  as  it  can  be  used  beyond  the 
LDV  context.   In  the  standard  Tobit  model,  the  test  statistic  is  of  a  particularly 
simple  form,  reducing  to  an  analog  of  the  usual  portmanteau  statistic  when 
autoregressive  or  moving  average  alternatives  are  specified. 



We have rigorously established the limiting distribution of the LM test statistic
against the general class of alternatives, under a minimal condition on the class.
We have used simulations to investigate the null and non-null distributions of
our test statistic in finite samples. We have also studied the truncated normal
model, and indeed our method of derivation and asymptotic proof could be
directly applied to a number of other models, such as the multivariate Tobit model
(Amemiya [2]), probit models (Rosett and Nelson [28]); and modified to deal with
models for markets in disequilibrium (Maddala and Nelson [19]), and work is
currently in progress on tests for the latter model. We have derived likelihoods
for two Tobit models containing lagged dependent variables and serially
correlated errors, and for one such model have obtained an LM test statistic
analogous to the one of Godfrey [13] in the uncensored case. In a number of
instances in the econometric literature, maximum likelihood estimators have been
presented whose asymptotic distribution does not follow from available theorems,
and is not derived. This is the case with the Tobit estimator in the presence of
lagged dependent variables, so we cannot assert that the usual $\chi^2$ distribution
obtains for the LM statistic in this case. However it would be most surprising
if this were not the case, under stationarity conditions on the lagged dependent
variable coefficients and other regularity conditions, and the matter could
in any case be investigated by simulations.

It is necessary to discuss the options available if the LM test rejects the
null hypothesis of serial independence. If tests against a variety of alternatives
have been carried out then more than one of them may reject so there may be no
unambiguous choice of time series model for the residual, and indeed ML estimation
under such a model is unlikely to be feasible, as previously indicated. For the same
reasons the preferable approach of carrying out further tests, using various serially
correlated models as the null hypothesis and testing against "higher order" serial
correlation, as in Godfrey [13], cannot be contemplated. Nevertheless consistent
nonparametric estimators of the residual autocorrelations can be obtained without
any model assumptions, as in Robinson [27]. From these a time series model for
the residuals might be inferred, if desired, although again the likelihood it
implies will not be tractable. However the autocorrelation estimators can be
inserted into the formula for the asymptotic covariance matrix of the usual
Tobit estimators under dependence obtained in Robinson [26], as explained there,
with or without use of an inferred time series model for the residuals. Thus it
is possible at least to employ correct inference procedures for the usual
estimators. The estimated covariance matrix under dependence in [26], incidentally,
might be employed in an alternative test of serial correlation, by comparing it
with the estimated covariance matrix under independence, of Amemiya [1], in the
way proposed by White [33] in a different context. This test is computationally
far more onerous than our LM test, however, and its power properties against
particular alternatives are unclear.

We mentioned earlier in the paper the possibility of misspecification due
to nonnormality and heteroscedasticity, which, unlike in uncensored models, cause
inconsistency of point estimators. In addition one must bear in mind the usual
problem of correctly specifying $x_t$ in Tobit, truncated and other models. Our
LM test of serial dependence will not be valid if any of these causes of
misspecification are present. By the same token tests for these types of
misspecification that assume serial independence (e.g. those derived by Bera, Jarque and
Lee [6], Jarque and Bera [18]) will not be valid if the residuals are in fact
correlated. One solution is to derive the asymptotic distribution of (say) the
scores based on serial independence under serial dependence, along the lines of
Theorem 2 of [26], the covariance matrix of which might be estimated. An alternative
approach would involve a portmanteau test of the three causes of misspecification,
plus serial dependence, as suggested by Bera and Jarque [5], for uncensored
regressions; the LM procedure can certainly be employed, although Wald and LR tests
for nonnormality, heteroscedasticity and misspecification of $x_t$ may also be
feasible, compared with those for serial dependence, at least. Such tests would not
be very powerful if there is departure from the null hypothesis in only one or
two directions, so it may be prudent to carry out a battery of tests, treating
the individual problems singly, and in twos and threes.

TABLE V
ESTIMATED POWER OF LM: MODEL 1

Sample                                 Power
Size      Alternative      LM(1)      LM(2)      LM(3)

 30            1            .616       .540       .423
               2            .984       .956       .940
               3            .753       .810       .768

 50            1            .854       .718       .674
               2           1.000      1.000       .996
               3            .924       .966       .946

 70            1            .914       .874       .852
               2           1.000      1.000      1.000
               3            .968       .992       .992

100            1            .972       .958       .946
               2           1.000      1.000      1.000
               3            .996      1.000      1.000

150            1            .996       .994       .990
               2           1.000      1.000      1.000
               3            .998      1.000      1.000

TABLE VI
ESTIMATED POWER OF LM: MODEL 2

Sample                                 Power
Size      Alternative      LM(1)      LM(2)      LM(3)

 30            1            .418       .314       .260
               2            .890       .838       .782
               3            .568       .640       .578

 50            1            .524       .432       .590
               2            .946       .948       .942
               3            .758       .820       .820

 70            1            .682       .590       .548
               2            .992       .992       .984
               3            .878       .942       .940

100            1            .854       .780       .730
               2           1.000      1.000      1.000
               3            .982       .998       .998

150            1            .946       .900       .878
               2           1.000      1.000      1.000
               3            .994      1.000      1.000